Overview

Dataset statistics

Number of variables: 13
Number of observations: 27378
Missing cells: 9811
Missing cells (%): 2.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.9 MiB
Average record size in memory: 417.4 B
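
The missing-cell percentage can be reproduced from these figures, since the number of cells is the number of variables times the number of observations. The sketch below is only an arithmetic check, not part of the report. It also sums the per-variable "Missing" counts listed in the Variables section. The key `(unnamed)` is a placeholder for the one variable whose name does not appear in this report text.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 13
n_observations = 27378
missing_cells = 9811

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Per-variable "Missing" counts as listed in the Variables section.
# "(unnamed)" is a placeholder: that variable's name is absent from the report.
missing_by_variable = {
    "Published Date": 1,
    "Project Type": 1,
    "Project Type Description": 1,
    "Budget Line": 1,
    "Budget Line Description": 165,
    "Funding Type": 1,
    "(unnamed)": 0,
    "First Fiscal Year": 1,
    "Fiscal Year 1 Amount": 0,
    "Fiscal Year 2 Amount": 0,
    "Fiscal Year 3 Amount": 0,
    "Fiscal Year 4 Amount": 0,
    "Fiscal Year 5 Amount": 9640,
}

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"sum of per-variable missing counts: {sum(missing_by_variable.values())}")
```

Almost all missing cells come from Fiscal Year 5 Amount, so the dataset-level percentage says little about the other columns.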

Variable types

NUM: 7
CAT: 6

Warnings

Budget Line has a high cardinality: 4163 distinct values (High cardinality)
Budget Line Description has a high cardinality: 1899 distinct values (High cardinality)
First Fiscal Year is highly correlated with Published Date (High correlation)
Published Date is highly correlated with First Fiscal Year (High correlation)
Fiscal Year 2 Amount is highly correlated with Fiscal Year 1 Amount and 2 other fields (High correlation)
Fiscal Year 1 Amount is highly correlated with Fiscal Year 2 Amount (High correlation)
Fiscal Year 3 Amount is highly correlated with Fiscal Year 2 Amount and 1 other fields (High correlation)
Fiscal Year 4 Amount is highly correlated with Fiscal Year 2 Amount and 2 other fields (High correlation)
Fiscal Year 5 Amount is highly correlated with Fiscal Year 4 Amount (High correlation)
Project Type Description is highly correlated with Project Type (High correlation)
Project Type is highly correlated with Project Type Description (High correlation)
Fiscal Year 5 Amount has 9640 (35.2%) missing values (Missing)
Fiscal Year 1 Amount is highly skewed (γ1 = 102.9564547) (Skewed)
Fiscal Year 2 Amount is highly skewed (γ1 = 98.30545073) (Skewed)
Fiscal Year 3 Amount is highly skewed (γ1 = 76.32076244) (Skewed)
Fiscal Year 4 Amount is highly skewed (γ1 = 67.77666991) (Skewed)
Fiscal Year 5 Amount is highly skewed (γ1 = 32.34754482) (Skewed)
Fiscal Year 1 Amount has 10517 (38.4%) zeros (Zeros)
Fiscal Year 2 Amount has 16259 (59.4%) zeros (Zeros)
Fiscal Year 3 Amount has 19858 (72.5%) zeros (Zeros)
Fiscal Year 4 Amount has 21110 (77.1%) zeros (Zeros)
Fiscal Year 5 Amount has 14676 (53.6%) zeros (Zeros)
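
The γ1 values in the skewness warnings are skewness coefficients. As a rough illustration of why columns dominated by zeros with a few very large amounts score so high, the sketch below computes the population (biased) Fisher-Pearson skewness on a small hypothetical sample. The sample values and the function name are made up for illustration. The report's figures may use a bias-corrected estimator, so a replication on the real data could differ slightly.

```python
def skewness(values):
    """Population Fisher-Pearson skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical zero-heavy sample with one large amount.
sample = [0, 0, 0, 0, 0, 0, 77, 500, 1239, 50000]
g1 = skewness(sample)

# A symmetric sample has zero skewness.
g1_symmetric = skewness([1, 2, 3, 4, 5])

print(f"zero-heavy sample: {g1:.2f}")
print(f"symmetric sample: {g1_symmetric:.2f}")
```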

Reproduction

Analysis started: 2020-12-12 20:10:00.983606
Analysis finished: 2020-12-12 20:10:08.783819
Duration: 7.8 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
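
The reported duration follows from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-12-12 20:10:00.983606", FMT)
finished = datetime.strptime("2020-12-12 20:10:08.783819", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```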

Variables

Published Date
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 20182187.54
Minimum: 20160426
Maximum: 20200416
Zeros: 0
Zeros (%): 0.0%
Memory size: 214.0 KiB

Quantile statistics

Minimum: 20160426
5-th percentile: 20160426
Q1: 20170426
Median: 20181010
Q3: 20191025
95-th percentile: 20200416
Maximum: 20200416
Range: 39990
Interquartile range (IQR): 20599

Descriptive statistics

Standard deviation: 13415.01729
Coefficient of variation (CV): 0.0006646958989
Kurtosis: -1.217536075
Mean: 20182187.54
Median Absolute Deviation (MAD): 10015
Skewness: -0.1158771836
Sum: 5.525277483e+11
Variance: 179962688.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value      Count  Frequency (%)
20200116   4106   15.0%
20190207   1967   7.2%
20171106   1963   7.2%
20181010   1960   7.2%
20200416   1958   7.2%
20180201   1958   7.2%
20170426   1957   7.1%
20180426   1950   7.1%
20191025   1947   7.1%
20190425   1946   7.1%
20160426   1896   6.9%
20161026   1889   6.9%
20170124   1880   6.9%
(Missing)  1      < 0.1%

Value      Count  Frequency (%)
20160426   1896   6.9%
20161026   1889   6.9%
20170124   1880   6.9%
20170426   1957   7.1%
20171106   1963   7.2%
20180201   1958   7.2%
20180426   1950   7.1%
20181010   1960   7.2%
20190207   1967   7.2%
20190425   1946   7.1%

Value      Count  Frequency (%)
20200416   1958   7.2%
20200116   4106   15.0%
20191025   1947   7.1%
20190425   1946   7.1%
20190207   1967   7.2%
20181010   1960   7.2%
20180426   1950   7.1%
20180201   1958   7.2%
20171106   1963   7.2%
20170426   1957   7.1%
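
The Published Date values look like dates encoded as YYYYMMDD integers. That is an assumption, since the report profiles the column as a real number. If it holds, statistics such as the mean, range and standard deviation above are computed on the integer encoding and are not meaningful as dates. The sketch below decodes the minimum and maximum, and shows that the integer part of the reported mean is not a valid date. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date

def parse_yyyymmdd(value):
    """Decode an integer such as 20160426 into a date (raises ValueError if invalid)."""
    value = int(value)
    return date(value // 10000, (value // 100) % 100, value % 100)

earliest = parse_yyyymmdd(20160426)  # reported minimum
latest = parse_yyyymmdd(20200416)    # reported maximum
span_days = (latest - earliest).days

# The reported "Range" of 39990 is a difference of encodings, not a number of days.
encoded_range = 20200416 - 20160426

try:
    parse_yyyymmdd(20182187)  # integer part of the reported mean
    mean_is_valid_date = True
except ValueError:
    mean_is_valid_date = False

print(earliest, latest, span_days, encoded_range, mean_is_valid_date)
```

Parsing the column into a date type before profiling would likely give more useful summaries for this variable.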
 

Project Type
Categorical

HIGH CORRELATION

Distinct: 40
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 214.0 KiB
PV: 7730
PW: 2941
HD: 2081
P: 1954
HL: 1804
Other values (35): 10867
Value              Count  Frequency (%)
PV                 7730   28.2%
PW                 2941   10.7%
HD                 2081   7.6%
P                  1954   7.1%
HL                 1804   6.6%
HB                 1378   5.0%
HW                 1221   4.5%
ED                 1151   4.2%
HR                 824    3.0%
SE                 502    1.8%
CS                 481    1.8%
HN                 476    1.7%
AG                 463    1.7%
CO                 373    1.4%
E                  351    1.3%
PO                 338    1.2%
TF                 294    1.1%
WP                 287    1.0%
F                  281    1.0%
S                  263    1.0%
HO                 253    0.9%
PU                 217    0.8%
LN                 216    0.8%
HH                 190    0.7%
C                  146    0.5%
Other values (15)  1162   4.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 3
Median length: 2
Mean length: 1.882569947
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 22
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
P      13610  26.4%
H      8519   16.5%
V      7730   15.0%
W      4608   8.9%
D      3310   6.4%
L      2296   4.5%
E      2154   4.2%
B      1613   3.1%
S      1266   2.5%
C      1000   1.9%
R      980    1.9%
O      964    1.9%
F      693    1.3%
N      692    1.3%
A      683    1.3%
T      502    1.0%
G      463    0.9%
U      217    0.4%
M      149    0.3%
Q      89     0.2%
n      2      < 0.1%
a      1      < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  51538  > 99.9%
Lowercase Letter  3      < 0.1%

Most frequent Uppercase Letter characters

Value  Count  Frequency (%)
P      13610  26.4%
H      8519   16.5%
V      7730   15.0%
W      4608   8.9%
D      3310   6.4%
L      2296   4.5%
E      2154   4.2%
B      1613   3.1%
S      1266   2.5%
C      1000   1.9%
R      980    1.9%
O      964    1.9%
F      693    1.3%
N      692    1.3%
A      683    1.3%
T      502    1.0%
G      463    0.9%
U      217    0.4%
M      149    0.3%
Q      89     0.2%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
n      2      66.7%
a      1      33.3%

Most occurring scripts

Value  Count  Frequency (%)
Latin  51541  100.0%

Most frequent Latin characters

Value  Count  Frequency (%)
P      13610  26.4%
H      8519   16.5%
V      7730   15.0%
W      4608   8.9%
D      3310   6.4%
L      2296   4.5%
E      2154   4.2%
B      1613   3.1%
S      1266   2.5%
C      1000   1.9%
R      980    1.9%
O      964    1.9%
F      693    1.3%
N      692    1.3%
A      683    1.3%
T      502    1.0%
G      463    0.9%
U      217    0.4%
M      149    0.3%
Q      89     0.2%
n      2      < 0.1%
a      1      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  51541  100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
P      13610  26.4%
H      8519   16.5%
V      7730   15.0%
W      4608   8.9%
D      3310   6.4%
L      2296   4.5%
E      2154   4.2%
B      1613   3.1%
S      1266   2.5%
C      1000   1.9%
R      980    1.9%
O      964    1.9%
F      693    1.3%
N      692    1.3%
A      683    1.3%
T      502    1.0%
G      463    0.9%
U      217    0.4%
M      149    0.3%
Q      89     0.2%
n      2      < 0.1%
a      1      < 0.1%
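
The lowercase counts (n: 2, a: 1) next to a single missing value are consistent with the missing entry having been converted to the text "nan" before characters were counted. That would also explain a maximum length of 3 for codes that otherwise have one or two characters. This is an inference from the counts, not something the report states. The sketch below reproduces the effect on a hypothetical four-row sample.

```python
from collections import Counter

# Hypothetical sample: three codes and one missing entry (a float NaN).
values = ["PV", "PW", "P", float("nan")]

# Converting to text turns the missing entry into the string "nan".
as_text = [str(v) for v in values]

char_counts = Counter("".join(as_text))
lowercase = {ch: n for ch, n in char_counts.items() if ch.islower()}
max_length = max(len(s) for s in as_text)

print(lowercase)
print(max_length)
```

If this is what happened, excluding missing values before the character analysis would remove the two lowercase characters and reduce the maximum length.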
 

Project Type Description
Categorical

HIGH CORRELATION

Distinct: 41
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 214.0 KiB
CULTURAL INSTITUTIONS: 7730
PUBLIC BUILDINGS: 2941
HOUSING & DEVELOPMENT: 2081
PARKS: 1954
HEALTH: 1804
Other values (36): 10867
Value                          Count  Frequency (%)
CULTURAL INSTITUTIONS          7730   28.2%
PUBLIC BUILDINGS               2941   10.7%
HOUSING & DEVELOPMENT          2081   7.6%
PARKS                          1954   7.1%
HEALTH                         1804   6.6%
HIGHWAY BRIDGES                1378   5.0%
HIGHWAYS                       1221   4.5%
ECONOMIC DEVELOPMENT           1151   4.2%
HUMAN RESOURCES                824    3.0%
SEWERS                         502    1.8%
ADMIN FOR CHILDREN'S SERVICES  481    1.8%
HIGHER EDUCATION               476    1.7%
DEPARTMENT FOR THE AGING       463    1.7%
COURTS                         373    1.4%
EDUCATION                      351    1.3%
POLICE                         338    1.2%
TRAFFIC                        294    1.1%
WATER POLLUTION CONTROL        287    1.0%
FIRE                           281    1.0%
SANITATION                     263    1.0%
HEALTH & HOSPITALS CORP.       253    0.9%
NEW YORK PUBLIC LIBRARY        216    0.8%
HOMELESS SERVICES              190    0.7%
EDP EQUIP & FINANC COSTS       178    0.7%
CORRECTION                     146    0.5%
Other values (16)              1201   4.4%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 31
Median length: 17
Mean length: 15.94141281
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 30
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value             Count  Frequency (%)
I                 47280  10.8%
T                 42547  9.7%
U                 35699  8.2%
S                 32297  7.4%
N                 31324  7.2%
L                 29865  6.8%
(space)           26272  6.0%
E                 25214  5.8%
O                 22644  5.2%
R                 20933  4.8%
A                 20679  4.7%
C                 18502  4.2%
H                 15047  3.4%
P                 11167  2.6%
G                 10752  2.5%
D                 10273  2.4%
B                 8547   2.0%
M                 6735   1.5%
W                 4078   0.9%
V                 4021   0.9%
Y                 3941   0.9%
&                 2669   0.6%
K                 2357   0.5%
F                 2109   0.5%
Q                 499    0.1%
Other values (5)  993    0.2%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   406510  93.1%
Space Separator    26272   6.0%
Other Punctuation  3659    0.8%
Lowercase Letter   3       < 0.1%

Most frequent Uppercase Letter characters

Value  Count  Frequency (%)
I      47280  11.6%
T      42547  10.5%
U      35699  8.8%
S      32297  7.9%
N      31324  7.7%
L      29865  7.3%
E      25214  6.2%
O      22644  5.6%
R      20933  5.1%
A      20679  5.1%
C      18502  4.6%
H      15047  3.7%
P      11167  2.7%
G      10752  2.6%
D      10273  2.5%
B      8547   2.1%
M      6735   1.7%
W      4078   1.0%
V      4021   1.0%
Y      3941   1.0%
K      2357   0.6%
F      2109   0.5%
Q      499    0.1%

Most frequent Space Separator characters

Value    Count  Frequency (%)
(space)  26272  100.0%

Most frequent Other Punctuation characters

Value  Count  Frequency (%)
&      2669   72.9%
'      481    13.1%
.      470    12.8%
,      39     1.1%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
n      2      66.7%
a      1      33.3%

Most occurring scripts

Value   Count   Frequency (%)
Latin   406513  93.1%
Common  29931   6.9%

Most frequent Latin characters

Value  Count  Frequency (%)
I      47280  11.6%
T      42547  10.5%
U      35699  8.8%
S      32297  7.9%
N      31324  7.7%
L      29865  7.3%
E      25214  6.2%
O      22644  5.6%
R      20933  5.1%
A      20679  5.1%
C      18502  4.6%
H      15047  3.7%
P      11167  2.7%
G      10752  2.6%
D      10273  2.5%
B      8547   2.1%
M      6735   1.7%
W      4078   1.0%
V      4021   1.0%
Y      3941   1.0%
K      2357   0.6%
F      2109   0.5%
Q      499    0.1%
n      2      < 0.1%
a      1      < 0.1%

Most frequent Common characters

Value    Count  Frequency (%)
(space)  26272  87.8%
&        2669   8.9%
'        481    1.6%
.        470    1.6%
,        39     0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  436444  100.0%

Most frequent ASCII characters

Value             Count  Frequency (%)
I                 47280  10.8%
T                 42547  9.7%
U                 35699  8.2%
S                 32297  7.4%
N                 31324  7.2%
L                 29865  6.8%
(space)           26272  6.0%
E                 25214  5.8%
O                 22644  5.2%
R                 20933  4.8%
A                 20679  4.7%
C                 18502  4.2%
H                 15047  3.4%
P                 11167  2.6%
G                 10752  2.5%
D                 10273  2.4%
B                 8547   2.0%
M                 6735   1.5%
W                 4078   0.9%
V                 4021   0.9%
Y                 3941   0.9%
&                 2669   0.6%
K                 2357   0.5%
F                 2109   0.5%
Q                 499    0.1%
Other values (5)  993    0.2%
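
Project Type has 40 distinct values while Project Type Description has 41. If both counts exclude the same missing row, at least one code must appear with more than one description. One way to find such codes is sketched below on hypothetical rows. The `XX` code and the `EXAMPLE` descriptions are invented for illustration and do not come from the dataset.

```python
from collections import defaultdict

# Hypothetical (code, description) pairs; "XX" is an invented code.
rows = [
    ("PV", "CULTURAL INSTITUTIONS"),
    ("PW", "PUBLIC BUILDINGS"),
    ("PV", "CULTURAL INSTITUTIONS"),
    ("XX", "EXAMPLE A"),
    ("XX", "EXAMPLE B"),
]

# Collect the set of descriptions seen for each code.
descriptions_by_code = defaultdict(set)
for code, description in rows:
    descriptions_by_code[code].add(description)

# Keep only codes that map to more than one description.
ambiguous = {
    code: sorted(descs)
    for code, descs in descriptions_by_code.items()
    if len(descs) > 1
}
print(ambiguous)
```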
 

Budget Line
Categorical

HIGH CARDINALITY

Distinct: 4163
Distinct (%): 15.2%
Missing: 1
Missing (%): < 0.1%
Memory size: 214.0 KiB
P 0245K: 24
PV0175: 24
P 0822: 24
HB0215: 24
P 0245R: 24
Other values (4158): 27257
Value                Count  Frequency (%)
P 0245K              24     0.1%
PV0175               24     0.1%
P 0822               24     0.1%
HB0215               24     0.1%
P 0245R              24     0.1%
ED0075               24     0.1%
ED0409               24     0.1%
P 1008               24     0.1%
TF0502               24     0.1%
F 0109               24     0.1%
HW1684               24     0.1%
PU0015               24     0.1%
FA0313               24     0.1%
C 0075               24     0.1%
HW0003               24     0.1%
P 1322               24     0.1%
HB1203               24     0.1%
P 1018               24     0.1%
HW0876               24     0.1%
HW0349               24     0.1%
S 0136               24     0.1%
LB0104               24     0.1%
HB1012               24     0.1%
CS0003               24     0.1%
HB1027               24     0.1%
Other values (4138)  26777  97.8%
 
Frequencies of value counts

Unique

Unique: 1771
Unique (%): 6.5%
Histogram of lengths of the category

Length

Max length: 8
Median length: 7
Mean length: 6.594601505
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 40
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value              Count  Frequency (%)
0                  24613  13.6%
N                  14321  7.9%
P                  13627  7.5%
1                  12182  6.7%
D                  11522  6.4%
2                  9804   5.4%
H                  8544   4.7%
V                  7735   4.3%
4                  7370   4.1%
3                  7219   4.0%
7                  7205   4.0%
6                  6615   3.7%
9                  6155   3.4%
8                  6018   3.3%
5                  6011   3.3%
W                  4632   2.6%
(space)            3216   1.8%
M                  2843   1.6%
L                  2315   1.3%
E                  2165   1.2%
K                  2014   1.1%
-                  1960   1.1%
R                  1637   0.9%
B                  1628   0.9%
Q                  1564   0.9%
Other values (15)  7632   4.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    93192  51.6%
Uppercase Letter  82176  45.5%
Space Separator   3216   1.8%
Dash Punctuation  1960   1.1%
Lowercase Letter  3      < 0.1%

Most frequent Uppercase Letter characters

Value  Count  Frequency (%)
N      14321  17.4%
P      13627  16.6%
D      11522  14.0%
H      8544   10.4%
V      7735   9.4%
W      4632   5.6%
M      2843   3.5%
L      2315   2.8%
E      2165   2.6%
K      2014   2.5%
R      1637   2.0%
B      1628   2.0%
Q      1564   1.9%
X      1351   1.6%
C      1334   1.6%
S      1275   1.6%
O      969    1.2%
A      756    0.9%
F      704    0.9%
T      508    0.6%
G      474    0.6%
U      223    0.3%
J      16     < 0.1%
Z      9      < 0.1%
Y      6      < 0.1%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
0      24613  26.4%
1      12182  13.1%
2      9804   10.5%
4      7370   7.9%
3      7219   7.7%
7      7205   7.7%
6      6615   7.1%
9      6155   6.6%
8      6018   6.5%
5      6011   6.5%

Most frequent Space Separator characters

Value    Count  Frequency (%)
(space)  3216   100.0%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
n      2      66.7%
a      1      33.3%

Most frequent Dash Punctuation characters

Value  Count  Frequency (%)
-      1960   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  98368  54.5%
Latin   82179  45.5%

Most frequent Latin characters

Value             Count  Frequency (%)
N                 14321  17.4%
P                 13627  16.6%
D                 11522  14.0%
H                 8544   10.4%
V                 7735   9.4%
W                 4632   5.6%
M                 2843   3.5%
L                 2315   2.8%
E                 2165   2.6%
K                 2014   2.5%
R                 1637   2.0%
B                 1628   2.0%
Q                 1564   1.9%
X                 1351   1.6%
C                 1334   1.6%
S                 1275   1.6%
O                 969    1.2%
A                 756    0.9%
F                 704    0.9%
T                 508    0.6%
G                 474    0.6%
U                 223    0.3%
J                 16     < 0.1%
Z                 9      < 0.1%
Y                 6      < 0.1%
Other values (3)  7      < 0.1%

Most frequent Common characters

Value    Count  Frequency (%)
0        24613  25.0%
1        12182  12.4%
2        9804   10.0%
4        7370   7.5%
3        7219   7.3%
7        7205   7.3%
6        6615   6.7%
9        6155   6.3%
8        6018   6.1%
5        6011   6.1%
(space)  3216   3.3%
-        1960   2.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  180547  100.0%

Most frequent ASCII characters

Value              Count  Frequency (%)
0                  24613  13.6%
N                  14321  7.9%
P                  13627  7.5%
1                  12182  6.7%
D                  11522  6.4%
2                  9804   5.4%
H                  8544   4.7%
V                  7735   4.3%
4                  7370   4.1%
3                  7219   4.0%
7                  7205   4.0%
6                  6615   3.7%
9                  6155   3.4%
8                  6018   3.3%
5                  6011   3.3%
W                  4632   2.6%
(space)            3216   1.8%
M                  2843   1.6%
L                  2315   1.3%
E                  2165   1.2%
K                  2014   1.1%
-                  1960   1.1%
R                  1637   0.9%
B                  1628   0.9%
Q                  1564   0.9%
Other values (15)  7632   4.2%
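
The Budget Line summary reports both "Distinct" (4163 different values) and "Unique" (1771 values). Assuming "Unique" means values that occur exactly once, the two can be told apart as in the sketch below. The short list is a hypothetical sample, not the real column. The percentages are recomputed from the counts reported above.

```python
from collections import Counter

# Hypothetical sample of budget line codes.
budget_lines = ["P 0245K", "P 0245K", "PV0175", "HB0215", "HB0215", "ED0075"]

counts = Counter(budget_lines)
distinct = len(counts)                                # number of different values
unique = sum(1 for n in counts.values() if n == 1)    # values seen exactly once

# Percentages for the reported counts, relative to all observations.
n_observations = 27378
distinct_pct = 100 * 4163 / n_observations
unique_pct = 100 * 1771 / n_observations

print(distinct, unique)
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")
```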
 

Budget Line Description
Categorical

HIGH CARDINALITY

Distinct: 1899
Distinct (%): 7.0%
Missing: 165
Missing (%): 0.6%
Memory size: 214.0 KiB
MISCELLANEOUS PARKS, PLAYGROUNDS, CONSTRUCTION, RECONSTRUCTI: 123
CONSTRUCTION OR ACQUISITION OF A NON-CITY OWNED PUBLIC BETTERMENT: 98
CONSTRUCTION AND IMPROVEMENTS TO CUNY COMMUNITY COLLEGES, CITYWIDE: 66
FIVE YEAR EDUCATIONAL FACILITIES CAPITAL PLAN: 65
CONSTRUCTION, IMPROVEMENTS, ACQUISITION, ALL CULTURAL INSTITUTIONS: 62
Other values (1894): 26799
Value  Count  Frequency (%)
MISCELLANEOUS PARKS, PLAYGROUNDS, CONSTRUCTION, RECONSTRUCTI  123  0.4%
CONSTRUCTION OR ACQUISITION OF A NON-CITY OWNED PUBLIC BETTERMENT  98  0.4%
CONSTRUCTION AND IMPROVEMENTS TO CUNY COMMUNITY COLLEGES, CITYWIDE  66  0.2%
FIVE YEAR EDUCATIONAL FACILITIES CAPITAL PLAN  65  0.2%
CONSTRUCTION, IMPROVEMENTS, ACQUISITION, ALL CULTURAL INSTITUTIONS  62  0.2%
SEVENTH REGIMENT ARMORY CONSERVANCY  56  0.2%
HENRY STREET SETTLEMENT  56  0.2%
QUALITY SERVICES FOR THE AUTISM COMMUNITY INC. (QSAC)  53  0.2%
MUSEUM OF CITY OF N. Y. IMPROVEMENTS  51  0.2%
CITY HARVEST, INC.  51  0.2%
JAMAICA ARTS CENTER, RECONSTRUCTION AND IMPROVEMENTS  50  0.2%
PLANNED PARENTHOOD OF NEW YORK CITY  50  0.2%
GOD'S LOVE WE DELIVER, INC.  49  0.2%
ABC NO RIO  48  0.2%
EDUCATIONAL ALLIANCE  48  0.2%
EYEBEAM, INC.  47  0.2%
NEW YORK RESTORATION PROJECT (NYRP)  47  0.2%
PREGONES THEATER  46  0.2%
SOUTH STREET SEAPORT MUSEUM  45  0.2%
JEWISH BOARD OF FAMILY AND CHILDREN'S SERVICES  44  0.2%
GOOD SHEPHERD SERVICES  44  0.2%
ASIAN AMERICANS FOR EQUALITY, INC. (AAFE)  43  0.2%
BALLET HISPANICO  42  0.2%
ST. ANN'S WAREHOUSE/ARTS AT ST. ANN'S  42  0.2%
MUSEUM OF THE MOVING IMAGE, THE AMERICAN  42  0.2%
Other values (1874)  25845  94.4%
(Missing)  165  0.6%
 
Frequencies of value counts

Unique

Unique: 68
Unique (%): 0.2%
Histogram of lengths of the category

Length

Max length: 70
Median length: 37
Mean length: 39.05285266
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 58
Unique unicode categories: 10
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value              Count   Frequency (%)
(space)            121254  11.3%
E                  95922   9.0%
N                  76984   7.2%
T                  76666   7.2%
O                  72886   6.8%
I                  70721   6.6%
R                  70502   6.6%
A                  64517   6.0%
S                  57075   5.3%
C                  52102   4.9%
L                  34770   3.3%
U                  30699   2.9%
M                  30466   2.8%
D                  25576   2.4%
H                  24781   2.3%
P                  21949   2.1%
Y                  18026   1.7%
,                  17570   1.6%
F                  15482   1.4%
B                  14004   1.3%
G                  13080   1.2%
V                  11699   1.1%
W                  11159   1.0%
K                  8214    0.8%
.                  7640    0.7%
Other values (33)  25445   2.4%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   906075  84.7%
Space Separator    121254  11.3%
Other Punctuation  30078   2.8%
Decimal Number     6564    0.6%
Dash Punctuation   1948    0.2%
Open Punctuation   1340    0.1%
Close Punctuation  1328    0.1%
Lowercase Letter   535     0.1%
Currency Symbol    46      < 0.1%
Control            21      < 0.1%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
n      330    61.7%
a      165    30.8%
t      14     2.6%
h      14     2.6%
o      8      1.5%
x      4      0.7%

Most frequent Uppercase Letter characters

Value  Count  Frequency (%)
E      95922  10.6%
N      76984  8.5%
T      76666  8.5%
O      72886  8.0%
I      70721  7.8%
R      70502  7.8%
A      64517  7.1%
S      57075  6.3%
C      52102  5.8%
L      34770  3.8%
U      30699  3.4%
M      30466  3.4%
D      25576  2.8%
H      24781  2.7%
P      21949  2.4%
Y      18026  2.0%
F      15482  1.7%
B      14004  1.5%
G      13080  1.4%
V      11699  1.3%
W      11159  1.2%
K      8214   0.9%
Q      4075   0.4%
J      1914   0.2%
X      1890   0.2%

Most frequent Dash Punctuation characters

Value  Count  Frequency (%)
-      1948   100.0%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  121254  100.0%

Most frequent Other Punctuation characters

Value  Count  Frequency (%)
,      17570  58.4%
.      7640   25.4%
&      1816   6.0%
/      1714   5.7%
'      1055   3.5%
:      183    0.6%
"      56     0.2%
#      20     0.1%
;      12     < 0.1%
!      12     < 0.1%

Most frequent Open Punctuation characters

Value  Count  Frequency (%)
(      1340   100.0%

Most frequent Close Punctuation characters

Value  Count  Frequency (%)
)      1328   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      1473   22.4%
2      967    14.7%
0      804    12.2%
3      674    10.3%
5      596    9.1%
4      519    7.9%
8      440    6.7%
9      425    6.5%
7      367    5.6%
6      299    4.6%

Most frequent Control characters

Value                Count  Frequency (%)
(control character)  21     100.0%

Most frequent Currency Symbol characters

Value  Count  Frequency (%)
$      46     100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   906610  84.8%
Common  162579  15.2%

Most frequent Latin characters

Value             Count  Frequency (%)
E                 95922  10.6%
N                 76984  8.5%
T                 76666  8.5%
O                 72886  8.0%
I                 70721  7.8%
R                 70502  7.8%
A                 64517  7.1%
S                 57075  6.3%
C                 52102  5.7%
L                 34770  3.8%
U                 30699  3.4%
M                 30466  3.4%
D                 25576  2.8%
H                 24781  2.7%
P                 21949  2.4%
Y                 18026  2.0%
F                 15482  1.7%
B                 14004  1.5%
G                 13080  1.4%
V                 11699  1.3%
W                 11159  1.2%
K                 8214   0.9%
Q                 4075   0.4%
J                 1914   0.2%
X                 1890   0.2%
Other values (7)  1451   0.2%

Most frequent Common characters

Value                Count   Frequency (%)
(space)              121254  74.6%
,                    17570   10.8%
.                    7640    4.7%
-                    1948    1.2%
&                    1816    1.1%
/                    1714    1.1%
1                    1473    0.9%
(                    1340    0.8%
)                    1328    0.8%
'                    1055    0.6%
2                    967     0.6%
0                    804     0.5%
3                    674     0.4%
5                    596     0.4%
4                    519     0.3%
8                    440     0.3%
9                    425     0.3%
7                    367     0.2%
6                    299     0.2%
:                    183     0.1%
"                    56      < 0.1%
$                    46      < 0.1%
(control character)  21      < 0.1%
#                    20      < 0.1%
;                    12      < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1069189  100.0%

Most frequent ASCII characters

Value              Count   Frequency (%)
(space)            121254  11.3%
E                  95922   9.0%
N                  76984   7.2%
T                  76666   7.2%
O                  72886   6.8%
I                  70721   6.6%
R                  70502   6.6%
A                  64517   6.0%
S                  57075   5.3%
C                  52102   4.9%
L                  34770   3.3%
U                  30699   2.9%
M                  30466   2.8%
D                  25576   2.4%
H                  24781   2.3%
P                  21949   2.1%
Y                  18026   1.7%
,                  17570   1.6%
F                  15482   1.4%
B                  14004   1.3%
G                  13080   1.2%
V                  11699   1.1%
W                  11159   1.0%
K                  8214    0.8%
.                  7640    0.7%
Other values (33)  25445   2.4%
 

Funding Type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 214.0 KiB
CITY: 23418
NON CITY: 3959
Value      Count  Frequency (%)
CITY       23418  85.5%
NON CITY   3959   14.5%
(Missing)  1      < 0.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 8
Median length: 4
Mean length: 4.578384104
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 9
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count  Frequency (%)
C        27377  21.8%
I        27377  21.8%
T        27377  21.8%
Y        27377  21.8%
N        7918   6.3%
O        3959   3.2%
(space)  3959   3.2%
n        2      < 0.1%
a        1      < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  121385  96.8%
Space Separator   3959    3.2%
Lowercase Letter  3       < 0.1%

Most frequent Uppercase Letter characters

Value  Count  Frequency (%)
C      27377  22.6%
I      27377  22.6%
T      27377  22.6%
Y      27377  22.6%
N      7918   6.5%
O      3959   3.3%

Most frequent Space Separator characters

Value    Count  Frequency (%)
(space)  3959   100.0%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
n      2      66.7%
a      1      33.3%

Most occurring scripts

Value   Count   Frequency (%)
Latin   121388  96.8%
Common  3959    3.2%

Most frequent Latin characters

Value  Count  Frequency (%)
C      27377  22.6%
I      27377  22.6%
T      27377  22.6%
Y      27377  22.6%
N      7918   6.5%
O      3959   3.3%
n      2      < 0.1%
a      1      < 0.1%

Most frequent Common characters

Value    Count  Frequency (%)
(space)  3959   100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  125347  100.0%

Most frequent ASCII characters

Value    Count  Frequency (%)
C        27377  21.8%
I        27377  21.8%
T        27377  21.8%
Y        27377  21.8%
N        7918   6.3%
O        3959   3.2%
(space)  3959   3.2%
n        2      < 0.1%
a        1      < 0.1%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 214.0 KiB
5: 17738
4: 9640
Value  Count  Frequency (%)
5      17738  64.8%
4      9640   35.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
5      17738  64.8%
4      9640   35.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  27378  100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
5      17738  64.8%
4      9640   35.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  27378  100.0%

Most frequent Common characters

Value  Count  Frequency (%)
5      17738  64.8%
4      9640   35.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  27378  100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
5      17738  64.8%
4      9640   35.2%
 

First Fiscal Year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.452095
Minimum: 2016
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Memory size: 214.0 KiB

Quantile statistics

Minimum: 2016
5-th percentile: 2016
Q1: 2017
Median: 2019
Q3: 2020
95-th percentile: 2020
Maximum: 2020
Range: 4
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.291061728
Coefficient of variation (CV): 0.0006396296107
Kurtosis: -1.149156571
Mean: 2018.452095
Median Absolute Deviation (MAD): 1
Skewness: -0.2606054237
Sum: 55259163
Variance: 1.666840384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Value      Count  Frequency (%)
2020       8011   29.3%
2019       5873   21.5%
2018       5871   21.4%
2017       5726   20.9%
2016       1896   6.9%
(Missing)  1      < 0.1%

Value  Count  Frequency (%)
2016   1896   6.9%
2017   5726   20.9%
2018   5871   21.4%
2019   5873   21.5%
2020   8011   29.3%

Value  Count  Frequency (%)
2020   8011   29.3%
2019   5873   21.5%
2018   5871   21.4%
2017   5726   20.9%
2016   1896   6.9%
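
The high correlation between First Fiscal Year and Published Date has a plausible explanation: the fiscal year may simply be derived from the publication date. The sketch below tests one such rule against the value counts reported for both variables. The rule itself is an assumption not stated in the report: Published Date is read as YYYYMMDD, and a fiscal year is taken to start on 1 July and to be labelled by the calendar year in which it ends. Matching totals are consistent with the rule but do not prove agreement row by row.

```python
from collections import Counter

# Value counts copied from the Published Date table.
published_counts = {
    20160426: 1896, 20161026: 1889, 20170124: 1880, 20170426: 1957,
    20171106: 1963, 20180201: 1958, 20180426: 1950, 20181010: 1960,
    20190207: 1967, 20190425: 1946, 20191025: 1947, 20200116: 4106,
    20200416: 1958,
}

# Value counts copied from the First Fiscal Year table.
first_fiscal_year_counts = {2016: 1896, 2017: 5726, 2018: 5871, 2019: 5873, 2020: 8011}

def fiscal_year(yyyymmdd, start_month=7):
    """Assumed rule: dates from July onward belong to the next year's fiscal year."""
    year = yyyymmdd // 10000
    month = (yyyymmdd // 100) % 100
    return year + 1 if month >= start_month else year

derived = Counter()
for published, count in published_counts.items():
    derived[fiscal_year(published)] += count

matches = dict(derived) == first_fiscal_year_counts
print(dict(sorted(derived.items())), matches)
```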
 

Fiscal Year 1 Amount
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 5388
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9730.912521
Minimum: -50216
Maximum: 19242283
Zeros: 10517
Zeros (%): 38.4%
Memory size: 214.0 KiB

Quantile statistics

Minimum: -50216
5-th percentile: 0
Q1: 0
Median: 77
Q3: 1239
95-th percentile: 32175.9
Maximum: 19242283
Range: 19292499
Interquartile range (IQR): 1239

Descriptive statistics

Standard deviation: 138384.8173
Coefficient of variation (CV): 14.22115521
Kurtosis: 13731.3725
Mean: 9730.912521
Median Absolute Deviation (MAD): 77
Skewness: 102.9564547
Sum: 266412923
Variance: 1.915035766e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    10517  38.4%
500                  417    1.5%
100                  213    0.8%
250                  200    0.7%
1000                 192    0.7%
50                   179    0.7%
1                    178    0.7%
35                   119    0.4%
300                  106    0.4%
200                  101    0.4%
400                  98     0.4%
150                  95     0.3%
2                    85     0.3%
40                   78     0.3%
2000                 75     0.3%
36                   74     0.3%
800                  67     0.2%
1500                 67     0.2%
750                  66     0.2%
44                   66     0.2%
350                  65     0.2%
60                   64     0.2%
30                   60     0.2%
3000                 54     0.2%
600                  50     0.2%
Other values (5363)  14092  51.5%

Value   Count  Frequency (%)
-50216  1      < 0.1%
-40675  9      < 0.1%
-40674  3      < 0.1%
-34904  2      < 0.1%
-30077  12     < 0.1%
-28707  1      < 0.1%
-22587  1      < 0.1%
-22584  1      < 0.1%
-22254  1      < 0.1%
-22232  1      < 0.1%

Value     Count  Frequency (%)
19242283  1      < 0.1%
3659433   1      < 0.1%
3656015   1      < 0.1%
3254645   1      < 0.1%
3200720   2      < 0.1%
2800720   1      < 0.1%
2734862   1      < 0.1%
2711902   1      < 0.1%
2600720   1      < 0.1%
2581220   2      < 0.1%
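
Several of the statistics reported for Fiscal Year 1 Amount are simple functions of one another, which makes them easy to sanity-check. The sketch below recomputes the mean, coefficient of variation, variance, range and IQR from the other reported figures. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared) and only checks consistency of the published numbers.

```python
import math

# Figures copied from the Fiscal Year 1 Amount statistics.
n_observations = 27378          # no missing values in this column
total = 266412923               # "Sum"
std = 138384.8173               # "Standard deviation"
minimum, maximum = -50216, 19242283
q1, q3 = 0, 1239

mean = total / n_observations
cv = std / mean                 # coefficient of variation
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.6f} cv={cv:.6f} variance={variance:.6e}")
print(f"range={value_range} iqr={iqr}")
```

The standard deviation is roughly 14 times the mean, which matches the picture given by the skewness and zeros warnings: most amounts are zero or small, and a few very large values dominate the moments.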
 

Fiscal Year 2 Amount
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 4625
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9508.901381
Minimum: -5799
Maximum: 17161904
Zeros: 16259
Zeros (%): 59.4%
Memory size: 214.0 KiB

Quantile statistics

Minimum: -5799
5-th percentile: 0
Q1: 0
Median: 0
Q3: 583.75
95-th percentile: 31006.9
Maximum: 17161904
Range: 17167703
Interquartile range (IQR): 583.75

Descriptive statistics

Standard deviation: 125544.7638
Coefficient of variation (CV): 13.20286738
Kurtosis: 12836.26011
Mean: 9508.901381
Median Absolute Deviation (MAD): 0
Skewness: 98.30545073
Sum: 260334702
Variance: 1.576148772e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    16259  59.4%
500                  256    0.9%
100                  195    0.7%
1000                 172    0.6%
250                  144    0.5%
200                  106    0.4%
50                   78     0.3%
1                    70     0.3%
150                  68     0.2%
2000                 66     0.2%
750                  53     0.2%
400                  51     0.2%
300                  50     0.2%
1500                 50     0.2%
35                   45     0.2%
2                    41     0.1%
40                   39     0.1%
4000                 38     0.1%
3                    35     0.1%
4                    35     0.1%
350                  34     0.1%
46                   33     0.1%
3000                 33     0.1%
5000                 33     0.1%
36                   31     0.1%
Other values (4600)  9363   34.2%

Value  Count  Frequency (%)
-5799  1      < 0.1%
-300   1      < 0.1%
-16    1      < 0.1%
-13    1      < 0.1%
0      16259  59.4%
1      70     0.3%
2      41     0.1%
3      35     0.1%
4      35     0.1%
5      15     0.1%

Value     Count  Frequency (%)
17161904  1      < 0.1%
3193120   1      < 0.1%
3172620   1      < 0.1%
3124208   1      < 0.1%
3003570   1      < 0.1%
2970879   1      < 0.1%
2737879   1      < 0.1%
2552820   1      < 0.1%
2514930   1      < 0.1%
2504770   1      < 0.1%
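
With 59.4% of the values equal to zero, the median of Fiscal Year 2 Amount is necessarily 0. The median absolute deviation (MAD) is then 0 as well, because more than half of the absolute deviations from that median are also 0. The sketch below shows the effect on a hypothetical sample with 60% zeros. The sample values are invented for illustration.

```python
from statistics import median

# Hypothetical sample: six zeros out of ten values.
sample = [0, 0, 0, 0, 0, 0, 100, 250, 500, 1000]

med = median(sample)
mad = median(abs(v - med) for v in sample)  # median absolute deviation

print(med, mad)
```

For columns like this, the median, IQR and MAD describe the zero mass rather than the spread of the non-zero amounts, so it can be more informative to summarise the non-zero values separately.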
 

Fiscal Year 3 Amount
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 3266
Distinct (%): 11.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7966.039667
Minimum: 0
Maximum: 12257537
Zeros: 19858
Zeros (%): 72.5%
Memory size: 214.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 50
95-th percentile: 24670.3
Maximum: 12257537
Range: 12257537
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 100105.2329
Coefficient of variation (CV): 12.56649943
Kurtosis: 8445.795657
Mean: 7966.039667
Median Absolute Deviation (MAD): 0
Skewness: 76.32076244
Sum: 218094234
Variance: 1.002105766e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    19858  72.5%
500                  170    0.6%
1000                 132    0.5%
100                  96     0.4%
250                  90     0.3%
200                  71     0.3%
2000                 60     0.2%
1                    55     0.2%
50                   47     0.2%
1500                 36     0.1%
300                  35     0.1%
400                  35     0.1%
750                  33     0.1%
150                  33     0.1%
700                  33     0.1%
4000                 31     0.1%
35                   31     0.1%
2                    31     0.1%
600                  28     0.1%
5                    28     0.1%
3                    28     0.1%
52                   26     0.1%
40                   25     0.1%
5000                 25     0.1%
450                  23     0.1%
Other values (3241)  6318   23.1%

Value  Count  Frequency (%)
0      19858  72.5%
1      55     0.2%
2      31     0.1%
3      28     0.1%
4      19     0.1%
5      28     0.1%
6      8      < 0.1%
7      9      < 0.1%
8      15     0.1%
9      19     0.1%

Value     Count  Frequency (%)
12257537  1      < 0.1%
3509320   1      < 0.1%
3431320   1      < 0.1%
3105320   1      < 0.1%
2707879   1      < 0.1%
2705379   1      < 0.1%
2697679   1      < 0.1%
2552820   1      < 0.1%
2502820   1      < 0.1%
2195569   1      < 0.1%
 

Fiscal Year 4 Amount
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 2659
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7158.49474
Minimum: -2494
Maximum: 11175734
Zeros: 21110
Zeros (%): 77.1%
Memory size: 214.0 KiB

Quantile statistics

Minimum: -2494
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 17050.25
Maximum: 11175734
Range: 11178228
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 97088.433
Coefficient of variation (CV): 13.56268832
Kurtosis: 6759.432391
Mean: 7158.49474
Median Absolute Deviation (MAD): 0
Skewness: 67.77666991
Sum: 195985269
Variance: 9426163822
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    21110  77.1%
500                  127    0.5%
1000                 97     0.4%
100                  82     0.3%
250                  63     0.2%
200                  62     0.2%
1                    50     0.2%
50                   47     0.2%
2                    41     0.1%
2000                 40     0.1%
5                    39     0.1%
3                    38     0.1%
300                  34     0.1%
10                   28     0.1%
400                  28     0.1%
750                  27     0.1%
5000                 25     0.1%
3000                 25     0.1%
150                  24     0.1%
4000                 24     0.1%
600                  23     0.1%
1500                 22     0.1%
25                   22     0.1%
35                   22     0.1%
36                   19     0.1%
Other values (2634)  5259   19.2%

Value  Count  Frequency (%)
-2494  1      < 0.1%
0      21110  77.1%
1      50     0.2%
2      41     0.1%
3      38     0.1%
4      10     < 0.1%
5      39     0.1%
6      16     0.1%
7      15     0.1%
8      14     0.1%

Value     Count  Frequency (%)
11175734  1      < 0.1%
3588380   1      < 0.1%
3431320   1      < 0.1%
3426320   1      < 0.1%
3414380   1      < 0.1%
3088380   1      < 0.1%
2692620   1      < 0.1%
2226316   2      < 0.1%
2193215   1      < 0.1%
2165569   3      < 0.1%
 

Fiscal Year 5 Amount
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 1594
Distinct (%): 9.0%
Missing: 9640
Missing (%): 35.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5955.045383
Minimum: 0
Maximum: 3414380
Zeros: 14676
Zeros (%): 53.6%
Memory size: 214.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 10355.35
Maximum: 3414380
Range: 3414380
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 70000.65979
Coefficient of variation (CV): 11.75484909
Kurtosis: 1280.733389
Mean: 5955.045383
Median Absolute Deviation (MAD): 0
Skewness: 32.34754482
Sum: 105630595
Variance: 4900092372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    14676  53.6%
1000                 47     0.2%
50                   36     0.1%
100                  35     0.1%
2000                 31     0.1%
500                  31     0.1%
1                    24     0.1%
250                  22     0.1%
4000                 19     0.1%
2                    19     0.1%
300                  18     0.1%
6000                 18     0.1%
3000                 17     0.1%
10000                16     0.1%
750                  14     0.1%
25                   14     0.1%
25000                14     0.1%
150                  13     < 0.1%
3                    13     < 0.1%
700                  13     < 0.1%
12000                12     < 0.1%
400                  12     < 0.1%
200                  12     < 0.1%
10                   12     < 0.1%
40                   12     < 0.1%
Other values (1569)  2588   9.5%
(Missing)            9640   35.2%

Value  Count  Frequency (%)
0      14676  53.6%
1      24     0.1%
2      19     0.1%
3      13     < 0.1%
4      6      < 0.1%
5      11     < 0.1%
6      6      < 0.1%
7      9      < 0.1%
8      7      < 0.1%
9      3      < 0.1%

Value    Count  Frequency (%)
3414380  2      < 0.1%
2913780  2      < 0.1%
2226316  1      < 0.1%
2201113  1      < 0.1%
2165569  2      < 0.1%
2066667  1      < 0.1%
2032329  1      < 0.1%
1421489  1      < 0.1%
974089   1      < 0.1%
782295   1      < 0.1%
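
Fiscal Year 5 Amount is the only amount column with missing values, so its percentages need care: the reported zero share (53.6%) is relative to all 27378 rows, while the mean is taken over the non-missing rows only. The sketch below separates the two denominators using the figures reported above. The 9640 missing rows equal the number of rows with value 4 in the unnamed two-valued variable earlier in the report, which suggests (but does not prove) that those records cover only four fiscal years.

```python
# Figures copied from the Fiscal Year 5 Amount statistics.
n_observations = 27378
missing = 9640
zeros = 14676
total = 105630595  # "Sum"

present = n_observations - missing          # rows with a value
nonzero_present = present - zeros           # rows with a non-zero amount

zeros_pct_of_all = 100 * zeros / n_observations
zeros_pct_of_present = 100 * zeros / present
mean_of_present = total / present

print(f"present rows: {present}, non-zero rows: {nonzero_present}")
print(f"zeros: {zeros_pct_of_all:.1f}% of all rows, {zeros_pct_of_present:.1f}% of present rows")
print(f"mean over present rows: {mean_of_present:.6f}")
```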
 

Interactions

[Figures: 49 interaction plots; images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
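As an illustration of that definition, here is a minimal pure-Python sketch. The function name `pearson_r` and the sample data are made up for this example; the report itself was produced by pandas-profiling, whose internals are not shown here.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Note: a constant input has zero standard deviation, so r is undefined there.
    return cov / (std_x * std_y)

# Hypothetical sample data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
r = pearson_r(x, y)

# Shifting and rescaling (by a positive factor) either variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 100 for v in x], [0.5 * v - 7 for v in y])
print(r, r_shifted)
```

The second call demonstrates the location and scale invariance: both values agree up to floating-point rounding.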

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
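The same recipe can be sketched in pure Python: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names below (`average_ranks`, `pearson_r`, `spearman_rho`) are illustrative, and giving tied values the average of their ranks is an assumption of this sketch.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship (hypothetical data): y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this cubic relationship the rank correlation is perfect, while Pearson's r falls short of 1 because the relationship is not linear.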

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
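A minimal sketch of that pair-counting definition follows. The function name `kendall_tau_a` and the sample data are illustrative. This plain form is often called tau-a; tie-adjusted variants such as tau-b give different values when the data contain ties, and which variant produced the figures in this report is not stated here.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = 0
    discordant = 0
    pairs = list(combinations(zip(xs, ys), 2))
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.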

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
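For reference, a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013) can be sketched as follows. The function name `cramers_v_corrected`, the list-of-rows table layout, and the absence of a continuity correction in the chi-squared statistic are assumptions of this sketch, not details confirmed by the report.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row total and column total is positive.
    """
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections to phi squared and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

# Hypothetical 2x2 tables.
v_perfect = cramers_v_corrected([[10, 0], [0, 10]])
v_independent = cramers_v_corrected([[5, 5], [5, 5]])
v_partial = cramers_v_corrected([[8, 2], [3, 7]])
print(v_perfect, v_independent, v_partial)
```

The first table shows perfect association, the second shows independence, and the third falls in between.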

Missing values

[Figures: four missing-value plots; images not preserved]

Sample

First rows

Published Date, Project Type, Project Type Description, Budget Line, Budget Line Description, Funding Type, Number of Years Presented, First Fiscal Year, Fiscal Year 1 Amount, Fiscal Year 2 Amount, Fiscal Year 3 Amount, Fiscal Year 4 Amount, Fiscal Year 5 Amount
020160426.0HDHOUSING & DEVELOPMENTHDKN525NaNCITY52016.00500000.0
120160426.0EEDUCATIONE M4001FITCITY52016.040150000.0
220160426.0AGDEPARTMENT FOR THE AGINGAGDN100CHINESE-AMERICAN PLANNING COUNCILCITY52016.05030000.0
320160426.0AGDEPARTMENT FOR THE AGINGAGDN145ELMCOR YOUTH AND ADULT ACTIVITIES, INC.CITY52016.00051000.0
420160426.0AGDEPARTMENT FOR THE AGINGAGDN169GLENRIDGE SENIOR CENTERCITY52016.00118000.0
520160426.0AGDEPARTMENT FOR THE AGINGAGDN184HEBREW HOME FOR THE AGEDCITY52016.016580114900.0
620160426.0AGDEPARTMENT FOR THE AGINGAGDN216JEWISH COMMUNITY COUNCIL OF GREATER CONEY ISLAND (JCCGCI)CITY52016.0119331000.0
720160426.0AGDEPARTMENT FOR THE AGINGAGDN235LENOX HILL NEIGHBORHOOD ASSOCIATIONCITY52016.02360035960.0
820160426.0AGDEPARTMENT FOR THE AGINGAGDN262MET COUNCIL ON JEWISH POVERTYCITY52016.028216429072570.0
920160426.0AGDEPARTMENT FOR THE AGINGAGDN334PRESBYTERIAN SENIOR SERVICESCITY52016.0500000.0

Last rows

Published Date, Project Type, Project Type Description, Budget Line, Budget Line Description, Funding Type, Number of Years Presented, First Fiscal Year, Fiscal Year 1 Amount, Fiscal Year 2 Amount, Fiscal Year 3 Amount, Fiscal Year 4 Amount, Fiscal Year 5 Amount
2736820200416.0WPWATER POLLUTION CONTROLWP0247UPGRADE JAMAICA WATER POLLUTION CONTROL PROJECTCITY52020.020300000.0
2736920200416.0WPWATER POLLUTION CONTROLWP0249UPGRADE TALLMANS ISLAND WATER POLLUTION CONTROL PROJECTCITY52020.0197010676000.0
2737020200416.0WPWATER POLLUTION CONTROLWP0269CONSTRUCTION, RECONSTRUCTION OF PUMPING STATION/FORCE MAINS, CITYWIDECITY52020.0187571428733826120156542414.0
2737120200416.0WPWATER POLLUTION CONTROLWP0269CONSTRUCTION, RECONSTRUCTION OF PUMPING STATION/FORCE MAINS, CITYWIDENON CITY52020.010601980002700.0
2737220200416.0WPWATER POLLUTION CONTROLWP0282ENG., ARCH., ADMIN. AND OTHER COSTS, DEPT. OF WATER RESOURCESCITY52020.06534834421239615306240590.0
2737320200416.0WPWATER POLLUTION CONTROLWP0283UPGRADE NEWTOWN CREEK WATER POLLUTION CONTROL PROJECTCITY52020.0-36233039000.0
2737420200416.0WPWATER POLLUTION CONTROLWP0284CITY-WIDE SLUDGE DISPOSAL FACILITIESCITY52020.0-9430000.0
2737520200416.0WPWATER POLLUTION CONTROLWP0285BIONUTRIENT REMOVAL FACILITIES, CITYWIDECITY52020.0142910002000193640.0
2737620200416.0WPWATER POLLUTION CONTROLWP0287UPGRADE CONEY ISLAND WATER POLLUTION CONTROL PROJECTCITY52020.0-290000.0
2737720200416.0WPWATER POLLUTION CONTROLWP0288UPGRADE OWLS HEAD WATER POLLUTION CONTROL PROJECTCITY52020.0-5310000.0